Introduction
This report presents trajectory modification results using the current implementation of the FAMAIL algorithm with the pseudo-causal fairness formulation (referred to as the "baseline" g(D) approach). The results demonstrate the algorithm's ability to improve both spatial and causal fairness while maintaining behavioral fidelity.
Algorithm Overview
Two-Phase Modification Pipeline
Phase 1: Attribution-Based Trajectory Selection
Not all trajectories contribute equally to unfairness. The attribution phase selects the top-k highest-impact trajectories using a combined Local Inequality Score (LIS) and Demand-Conditional Deviation (DCD):
$$\text{Impact}(\tau) = w_{\text{LIS}} \cdot \widetilde{\text{LIS}}(\tau) + w_{\text{DCD}} \cdot \widetilde{\text{DCD}}(\tau)$$

where $\widetilde{\text{LIS}}$ and $\widetilde{\text{DCD}}$ are normalized to $[0, 1]$, and $w_{\text{LIS}} = w_{\text{DCD}} = 0.5$ by default.
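Under these defaults, the selection rule can be sketched as follows. This is an illustrative NumPy sketch, not the project's implementation; the `select_top_k` helper and the min-max normalization are assumptions.

```python
import numpy as np

def select_top_k(lis: np.ndarray, dcd: np.ndarray, k: int = 10,
                 w_lis: float = 0.5, w_dcd: float = 0.5) -> np.ndarray:
    """Return indices of the k highest-impact trajectories.

    lis, dcd: raw per-trajectory Local Inequality Scores and
    Demand-Conditional Deviations (one entry per trajectory).
    Min-max normalization to [0, 1] is an assumption here.
    """
    def minmax(v: np.ndarray) -> np.ndarray:
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    impact = w_lis * minmax(lis) + w_dcd * minmax(dcd)
    # argsort is ascending; take the last k and reverse for descending order
    return np.argsort(impact)[-k:][::-1]
```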
Phase 2: ST-iFGSM Gradient-Based Modification
For each selected trajectory, the algorithm iteratively perturbs the pickup location to maximize the combined objective

$$J = \alpha_1 F_{\text{spatial}} + \alpha_2 F_{\text{causal}} + \alpha_3 F_{\text{fidelity}}$$

with the three terms defined below.
Objective Function Terms
| Term | Formula | Interpretation |
|---|---|---|
| $F_{\text{spatial}}$ | $1 - \frac{1}{2}(G_{\text{DSR}} + G_{\text{ASR}})$ | Gini-based spatial equality (1 = perfect equity) |
| $F_{\text{causal}}$ | $\max(0, 1 - \frac{\text{Var}(R)}{\text{Var}(Y)})$ where $R = Y - g(D)$ | R²-based demand proportionality (1 = all variance explained by demand) |
| $F_{\text{fidelity}}$ | $f_{\text{disc}}(\tau, \tau')$ (ST-SiameseNet) | Discriminator similarity score (1 = indistinguishable from original) |
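As a concrete (simplified) reading of the table, the combined objective could be computed as below. The Gini and variance formulas follow the table directly; the per-cell `dsr`/`asr` arrays and the scalar `disc_score` (the ST-SiameseNet output) are assumed inputs, and the function names are illustrative.

```python
import numpy as np

def gini(x: np.ndarray) -> float:
    """Gini coefficient of a non-negative array (0 = perfect equality)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if x.sum() == 0:
        return 0.0
    cum = np.cumsum(x)
    # Equivalent to the mean-absolute-difference formulation
    return (n + 1 - 2 * (cum / cum[-1]).sum()) / n

def combined_objective(dsr, asr, y, g_of_d, disc_score,
                       a1=0.33, a2=0.33, a3=0.34) -> float:
    """J = a1*F_spatial + a2*F_causal + a3*F_fidelity (table above)."""
    f_spatial = 1.0 - 0.5 * (gini(dsr) + gini(asr))
    r = y - g_of_d                                   # residual R = Y - g(D)
    f_causal = max(0.0, 1.0 - np.var(r) / np.var(y))
    return a1 * f_spatial + a2 * f_causal + a3 * disc_score
```

With perfectly uniform service ratios, service fully explained by demand, and a discriminator score of 1, every term is 1 and J reaches its maximum of 1.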
Soft Cell Assignment
The algorithm uses Gaussian soft cell assignment to enable differentiable gradient flow from discrete cell counts to continuous pickup locations.
Temperature annealing ($\tau: 1.0 \to 0.1$) transitions from soft (smooth gradients) to hard (precise) assignment over the course of optimization.
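One plausible instantiation of this idea is sketched below. The exact kernel is not specified above, so the Gaussian/softmax form, the linear annealing schedule, and both function names are assumptions for illustration.

```python
import numpy as np

def soft_cell_counts(pickups: np.ndarray, centers: np.ndarray,
                     tau: float) -> np.ndarray:
    """Differentiable pickup-to-cell assignment (Gaussian-kernel sketch).

    pickups: (N, 2) continuous pickup coordinates (grid units)
    centers: (C, 2) cell-center coordinates
    tau:     temperature; small tau -> near one-hot (hard) assignment
    """
    # Squared distance from every pickup to every cell center
    d2 = ((pickups[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    logits = -d2 / max(tau, 1e-8)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # softmax over cells
    return w.sum(axis=0)                          # expected count per cell

def anneal_tau(t: int, T: int, tau_start=1.0, tau_end=0.1) -> float:
    """Linear temperature schedule from tau_start to tau_end over T steps."""
    return tau_start + (tau_end - tau_start) * t / max(T - 1, 1)
```

At low temperature the softmax concentrates nearly all weight on the nearest cell, recovering the hard count while keeping the mapping differentiable during early iterations.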
Experiment Configuration
| Parameter | Value | Description |
|---|---|---|
| Convergence threshold (θ) | 1.0e-6 | Minimum objective change to continue iteration |
| Epsilon (ε) | 3.0 | Maximum perturbation per dimension (grid cells) |
| Alpha (α) | 0.10 | ST-iFGSM step size |
| Max iterations (T) | 50 | Maximum iterations per trajectory |
| α₁ (Spatial weight) | 0.33 | Objective function weights (normalized sum = 1) |
| α₂ (Causal weight) | 0.33 | |
| α₃ (Fidelity weight) | 0.34 | |
| Selection mode | Top-k by Fairness Impact | Attribution-based selection using LIS + DCD |
| k (trajectories selected) | 10 | Number of high-impact trajectories to modify |
| Discriminator checkpoint | pass-seek_5000-20000_(84ident_72same_44diff)/best.pt | Best-performing discriminator model |
| Gradient mode | soft_cell | Differentiable soft cell assignment with annealing |
| Temperature annealing | Enabled (τ: 1.0 → 0.1) | Gradual transition from soft to hard assignment |
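The update rule implied by these parameters (step size α, ε-ball projection, iteration cap T, convergence threshold θ) can be sketched as a single-trajectory loop. This is illustrative only: the real pipeline differentiates through the discriminator and soft cell assignment, whereas `grad_fn` and `objective_fn` here are hypothetical caller-supplied stand-ins.

```python
import numpy as np

def st_ifgsm(pickup0, grad_fn, objective_fn,
             alpha=0.10, eps=3.0, T=50, theta=1e-6):
    """iFGSM-style pickup perturbation (sketch).

    pickup0:      initial pickup coordinates (array)
    grad_fn:      gradient of the objective w.r.t. the pickup location
    objective_fn: combined objective J evaluated at a pickup location
    """
    p = np.asarray(pickup0, dtype=float).copy()
    prev = objective_fn(p)
    for _ in range(T):
        # Ascend the objective along the gradient sign
        p = p + alpha * np.sign(grad_fn(p))
        # Project back into the eps-ball around the original pickup
        p = np.clip(p, pickup0 - eps, pickup0 + eps)
        cur = objective_fn(p)
        if abs(cur - prev) < theta:   # convergence threshold theta
            break
        prev = cur
    return p
```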
Overall Modification Results
Trajectory Visualizations
The following visualizations show the selected trajectories overlaid on the 48×90 study area grid. Each figure contains two panels: the left panel shows the trajectories at their original (pre-modification) pickup locations, while the right panel shows the trajectories after the ST-iFGSM algorithm has perturbed their pickup locations. The background heatmap in each panel shows the corresponding fairness metric value per cell, with the color scale shown below the panels.
Spatial Fairness (Gini-based DSR)
The heatmap below encodes the spatial fairness landscape: cells with higher values (warmer colors) exhibit greater demand–service imbalance. The modification algorithm shifts pickup locations away from over-served cells toward under-served areas, aiming to equalize the Demand-Service Ratio across the grid.
Causal Fairness (Demand-Conditional)
This visualization uses the same layout but overlays trajectories on the causal fairness heatmap, which reflects the residual $R = Y - g(D)$ per cell. Cells with higher residuals (warmer colors) have service levels that deviate more from what demand alone would predict. The algorithm attempts to reduce these residuals by redistributing pickups toward demand-proportional patterns.
Modification Details
For each modified trajectory, the cards below show how the objective function terms changed during optimization. The before values are from the initial evaluation (iteration 1) and the after values are from the final converged state.
Per-Trajectory Fairness Impact
Iteration Details
The following sections show the iteration-by-iteration progress for each modified trajectory. All objective values are shown with full precision to clearly illustrate convergence behavior.
Visualizations
Objective Evolution by Trajectory
Perturbation Magnitude Distribution
Convergence Rate Distribution
Causal Fairness Reformulation
Why Reformulate?
The current "pseudo-causal" fairness formulation uses $F_{\text{causal}} = \max(0, 1 - \frac{\text{Var}(R)}{\text{Var}(Y)})$ where $R = Y - g(D)$ and $g(D)$ is fitted using demand only. While this measures whether service is proportional to demand, it does not account for demographic factors that may influence both demand and service patterns.
A true causal fairness measure should answer: "After accounting for demand, is there remaining bias associated with demographic characteristics?" To address this, we are developing a reformulated $F_{\text{causal}}$ that incorporates demographic data via a conditional expectation function $g(D, \mathbf{x})$.
Proposed Reformulation Options
Option A1: Demographic Attribution
Trains a conditional model $g(D, \mathbf{x}) = \hat{\mathbb{E}}[Y \mid D, \mathbf{x}]$ and measures how much predictions change when demographics are replaced with the population mean $\bar{\mathbf{x}}$:

$$F_{\text{causal}}^{(A1)} = \max\!\left(0,\ 1 - \frac{\text{Var}\big(g(D, \mathbf{x}) - g(D, \bar{\mathbf{x}})\big)}{\text{Var}(Y)}\right)$$
Interpretation: The numerator captures variance in predictions attributable to demographics. If demographics have no influence on g's predictions, this ratio is 0 and F = 1 (perfectly fair).
Pros: Directly measures demographic influence; intuitive interpretation.
Cons: Requires choosing a reference demographic profile $\bar{\mathbf{x}}$; may be sensitive to model architecture.
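A minimal sketch of Option A1 follows. Since the estimator is still being tuned, the Ridge model for $g(D, \mathbf{x})$, the $\text{Var}(Y)$ normalizer, and the `causal_fairness_a1` name are all assumptions, not the final design.

```python
import numpy as np
from sklearn.linear_model import Ridge

def causal_fairness_a1(D, X, Y, model=None):
    """Option A1 sketch: demographic attribution via g(D, x).

    D: (n, 1) demand, X: (n, p) demographics, Y: (n,) service.
    """
    g = model if model is not None else Ridge(alpha=1.0)  # assumed model
    feats = np.hstack([D, X])
    g.fit(feats, Y)

    x_bar = X.mean(axis=0, keepdims=True)       # population-mean profile
    feats_ref = np.hstack([D, np.repeat(x_bar, len(D), axis=0)])

    # Variance in predictions attributable to demographics
    delta = g.predict(feats) - g.predict(feats_ref)
    return max(0.0, 1.0 - np.var(delta) / np.var(Y))
```

If demographics carry no information (every row already equals $\bar{\mathbf{x}}$), `delta` is identically zero and the score is exactly 1.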
Option B: Demographic Disparity
Fits the baseline $g_0(D)$ (demand-only), computes residuals $R = Y - g_0(D)$, then regresses residuals on demographics:

$$F_{\text{causal}}^{(B)} = \max\!\left(0,\ 1 - R^2_{R \sim \mathbf{x}}\right)$$

where $R^2_{R \sim \mathbf{x}}$ is the coefficient of determination of the residual-on-demographics regression.
Interpretation: If demographics explain residual variance (high R²), that indicates demographic bias. A high F score means residuals are independent of demographics.
Pros: Simple; directly tests residual independence; no need for reference profile.
Cons: Two-stage estimation may lose efficiency; R² can be unstable with high-dimensional demographics.
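Option B can be sketched as a two-stage estimation. The plain linear models and the `causal_fairness_b` helper are illustrative assumptions; the actual estimator family is still under evaluation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def causal_fairness_b(D, X, Y):
    """Option B sketch: residual-on-demographics regression.

    Stage 1: demand-only baseline g0(D).
    Stage 2: R^2 of the residuals regressed on demographics X;
             a high R^2 (demographics explain residuals) lowers F.
    """
    g0 = LinearRegression().fit(D, Y)        # baseline g0(D)
    R = Y - g0.predict(D)                    # demand-adjusted residuals
    h = LinearRegression().fit(X, R)
    r2 = h.score(X, R)                       # residual variance explained by X
    return max(0.0, 1.0 - r2)
```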
Current Status
We are currently developing and tuning the $g(D, \mathbf{x})$ estimator using the demographic_explorer.py dashboard. Key considerations include:
- Model architecture: Polynomial features for D, Ridge/Lasso/ElasticNet regression
- Feature engineering: Demographic interactions, spatial aggregations
- Cross-validation: Leave-One-District-Out (LODO) to assess generalization
- Multicollinearity: VIF analysis to handle correlated demographic features
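The LODO scheme from the list above can be sketched with scikit-learn's `LeaveOneGroupOut`, treating district labels as CV groups so each fold holds out one entire district. The `lodo_scores` helper and the Ridge model are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneGroupOut

def lodo_scores(features, Y, districts, alpha=1.0):
    """Leave-One-District-Out CV sketch for the g(D, x) estimator.

    features:  (n, p) design matrix (demand + demographic columns)
    Y:         (n,) service levels
    districts: (n,) district label per cell, used as the CV group
    Returns a dict mapping held-out district -> out-of-district R^2.
    """
    logo = LeaveOneGroupOut()
    scores = {}
    for train_idx, test_idx in logo.split(features, Y, groups=districts):
        model = Ridge(alpha=alpha).fit(features[train_idx], Y[train_idx])
        held_out = districts[test_idx][0]
        scores[held_out] = r2_score(Y[test_idx],
                                    model.predict(features[test_idx]))
    return scores
```

Low out-of-district R² for a particular fold would flag that the fitted $g(D, \mathbf{x})$ does not generalize spatially to that district.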
FAMAIL Project | Fairness-Aware Multi-Agent Imitation Learning
San Diego State University · Computer Science